docs: add HIP/AMD NaN warning for q8_0/turbo3 on large K-norm models #66

Open
brosequist wants to merge 1 commit into TheTom:main from brosequist:docs/hip-nan-warning

Conversation

@brosequist

Summary

  • Adds a prominent WARNING block to docs/turboquant-recommendations.md documenting observed NaN divergence when using q8_0 or turbo3 on models with large K-vector norms (e.g. Qwen2.5-7B) on AMD/ROCm (HIP) backends.
  • Includes recommended mitigations: switch to turbo2/turbo4, or add pre-quantization K-norm clipping.
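
The second mitigation, pre-quantization K-norm clipping, can be sketched as follows. This is a minimal illustration only: the helper name `clip_k_norm` and the `max_norm` threshold are hypothetical and do not come from the repository, and the sketch assumes a K vector is available as a plain list of floats before it reaches the quantizer.

```python
import math


def clip_k_norm(k, max_norm):
    """Rescale k so that its L2 norm does not exceed max_norm.

    Hypothetical helper for illustration; not part of the project's API.
    The direction of k is preserved, only its length is reduced.
    """
    norm = math.sqrt(sum(x * x for x in k))
    if norm == 0.0 or norm <= max_norm:
        # Already within bounds: return an unchanged copy.
        return list(k)
    scale = max_norm / norm
    return [x * scale for x in k]


# Usage: a vector of norm 50 is scaled down to norm 5.
clipped = clip_k_norm([30.0, 40.0], max_norm=5.0)
```

Clipping is lossy: shrinking a K vector also shrinks its attention logits, so any threshold would need to be validated against model quality rather than picked arbitrarily.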

Test plan

  • Docs-only change, no code to test.

🤖 Generated with Claude Code

Adds a prominent WARNING block to turboquant-recommendations.md documenting
the observed NaN divergence when using q8_0 or turbo3 compression on models
with large K-vector norms (e.g. Qwen2.5-7B) on AMD/ROCm (HIP) backends.
The root cause is the int8 overflow path that differs between HIP and CUDA.
Recommended mitigations: switch to turbo2/turbo4 or add pre-quantization
K-norm clipping.
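
The commit message attributes the divergence to an int8 overflow path that differs between HIP and CUDA, without further detail. As a generic illustration only (not the actual kernel code of either backend), the sketch below shows how two int8 conversion strategies can disagree once a value leaves the representable range of -128 to 127. Both function names are hypothetical.

```python
def to_int8_wrapping(v):
    """Keep only the low 8 bits, as two's-complement truncation does."""
    return ((int(v) + 128) % 256) - 128


def to_int8_saturating(v):
    """Clamp to the int8 range instead of wrapping around."""
    return max(-128, min(127, int(v)))


# In range, both strategies agree.
assert to_int8_wrapping(100) == to_int8_saturating(100) == 100

# Out of range, they diverge: wrapping flips the sign, saturation clamps.
assert to_int8_wrapping(130) == -126
assert to_int8_saturating(130) == 127
```

A sign flip of this kind is the sort of discrepancy that can make otherwise identical quantized computations diverge across backends; whether this is the exact mechanism behind the reported NaNs is not stated in the PR.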

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>